AI029

Reinforcement Learning: An Introduction

Function Approximation and Policy Gradient Methods

Lecture

Lesson 9

Date

2026-04-21

Teacher

AI Tutor

Duration

60 Mins

Learning Objectives

Identify the limitations of tabular methods in high-dimensional state spaces.
Formulate the value function approximation problem using the Mean Squared Value Error (VE) objective.
Derive the Policy Gradient Theorem and apply it in the REINFORCE algorithm.
Analyze the benefits of Actor-Critic architectures for reducing variance in policy updates.